A Quantum-Inspired Analysis of Human Disambiguation Processes
Formal languages are essential to computer programming and are constructed to be easily processed by computers. Natural languages, in contrast, are far harder to process automatically, a difficulty that gave rise to the field of Natural Language Processing (NLP). One major obstacle is the ubiquity of ambiguity. Recent advances in NLP have produced large language models that can resolve ambiguities with high accuracy. At the same time, quantum computers have attracted much attention in recent years because they can solve some computational problems faster than classical computers. This new computing paradigm has reached machine learning and NLP, where hybrid classical-quantum learning algorithms have emerged. However, more research is needed to identify which NLP tasks could benefit from a genuine quantum advantage. In this thesis, we apply formalisms from foundational quantum mechanics, such as contextuality and causality, to study ambiguities arising in linguistics. In doing so, we also reproduce psycholinguistic results on the human disambiguation process. We then use these results to predict human behaviour, outperforming current NLP methods.
Multipath parsing in the brain
Berta Franzluebbers, Donald Dunagan, Miloš Stanojević, Jan Buys, John T. Hale
Humans understand sentences word-by-word, in the order that they hear them. This incrementality entails resolving temporary ambiguities about syntactic relationships. We investigate how humans process these syntactic ambiguities by correlating predictions from incremental generative dependency parsers with timecourse data from people undergoing functional neuroimaging while listening to an audiobook. In particular, we compare competing hypotheses regarding the number of developing syntactic analyses in play during word-by-word comprehension: one vs more than one. This comparison involves evaluating syntactic surprisal from a state-of-the-art dependency parser with LLM-adapted encodings against an existing fMRI dataset. In both English and Chinese data, we find evidence for multipath parsing. Brain regions associated with this multipath effect include bilateral superior temporal gyrus.
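The single-path vs. multipath contrast can be made concrete with a toy sketch. This is not the paper's dependency parser or its LLM-adapted encodings; the two analyses and all probabilities below are invented for illustration. A serial parser predicts the next word from only its top-ranked analysis, while a parallel (multipath) parser marginalizes over every analysis it maintains:

```python
import math

# Hypothetical next-word distributions under two competing incremental
# analyses of the same sentence prefix (illustrative numbers only).
analyses = [
    {"prob": 0.7, "next": {"past": 0.8, "fell": 0.2}},  # preferred reading
    {"prob": 0.3, "next": {"past": 0.1, "fell": 0.9}},  # dispreferred reading
]

def serial_surprisal(word):
    """Single-path parsing: predict from the most probable analysis only."""
    best = max(analyses, key=lambda a: a["prob"])
    return -math.log2(best["next"][word])

def parallel_surprisal(word):
    """Multipath parsing: marginalize the next-word probability over all
    maintained analyses, weighted by each analysis's probability."""
    p = sum(a["prob"] * a["next"][word] for a in analyses)
    return -math.log2(p)
```

At a disambiguating word favoured by the dispreferred analysis, the two hypotheses diverge: the parallel parser, having kept the second analysis alive, assigns the word higher probability and hence lower surprisal than the serial parser. Correlating such per-word predictions with neural time-course data is what lets the study distinguish the two hypotheses.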
Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities
Suhas Arehalli, Brian Dillon, Tal Linzen
Humans exhibit garden path effects: When reading sentences that are temporarily structurally ambiguous, they slow down when the structure is disambiguated in favor of the less preferred alternative. Surprisal theory (Hale, 2001; Levy, 2008), a prominent explanation of this finding, proposes that these slowdowns are due to the unpredictability of each of the words that occur in these sentences. Challenging this hypothesis, van Schijndel & Linzen (2021) find that estimates of the cost of word predictability derived from language models severely underestimate the magnitude of human garden path effects. In this work, we consider whether this underestimation is due to the fact that humans weight syntactic factors in their predictions more highly than language models do. We propose a method for estimating syntactic predictability from a language model, allowing us to weigh the cost of lexical and syntactic predictability independently. We find that treating syntactic predictability independently from lexical predictability indeed results in larger estimates of garden path effects. At the same time, even when syntactic predictability is independently weighted, surprisal still greatly underestimates the magnitude of human garden path effects. Our results support the hypothesis that predictability is not the only factor responsible for the processing cost associated with garden path sentences.
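The surprisal measure underlying this line of work is simply a word's negative log-probability in context: S(w) = -log2 P(w | context). The sketch below illustrates it with hand-picked bigram counts standing in for a trained language model; the counts and the word pairs are invented, not drawn from the paper's materials:

```python
import math

# Toy bigram counts standing in for a trained language model (invented data).
# Surprisal theory (Hale, 2001; Levy, 2008): the processing cost of a word
# is its surprisal in bits, S(w_i) = -log2 P(w_i | preceding context).
bigram_counts = {
    ("horse", "raced"): 5, ("horse", "fell"): 1,
    ("raced", "past"): 6, ("raced", "fell"): 1,
}

def surprisal(prev, word):
    """Surprisal in bits of `word` given the previous word, estimated from
    relative bigram frequency."""
    total = sum(c for (p, _), c in bigram_counts.items() if p == prev)
    count = bigram_counts.get((prev, word), 0)
    if total == 0 or count == 0:
        return float("inf")  # unseen continuation: maximally surprising
    return -math.log2(count / total)

# At the disambiguating word of a garden path, the dispreferred continuation
# ("fell" after "raced") is far less predictable than the preferred one
# ("past"), so it carries higher surprisal -- the predicted slowdown.
assert surprisal("raced", "fell") > surprisal("raced", "past")
```

The paper's contribution is to split this quantity into a lexical and a syntactic component and weight the two independently when fitting reading times; the finding is that even the reweighted estimate falls short of the human effect size.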